Ranking Chemical Structures for Drug Discovery: A New Machine Learning Approach

نویسندگان

  • Shivani Agarwal
  • Deepak Dugar
  • Shiladitya Sengupta
چکیده

With chemical libraries increasingly containing millions of compounds or more, there is a fast-growing need for computational methods that can rank or prioritize compounds for screening. Machine learning methods have shown considerable promise for this task; indeed, classification methods such as support vector machines (SVMs), together with their variants, have been used in virtual screening to distinguish active compounds from inactive ones, while regression methods such as partial least-squares (PLS) and support vector regression (SVR) have been used in quantitative structure-activity relationship (QSAR) analysis for predicting biological activities of compounds. Recently, a new class of machine learning methods - namely, ranking methods, which are designed to directly optimize ranking performance - have been developed for ranking tasks such as web search that arise in information retrieval (IR) and other applications. Here we report the application of these new ranking methods in machine learning to the task of ranking chemical structures. Our experiments show that the new ranking methods give better ranking performance than both classification based methods in virtual screening and regression methods in QSAR analysis. We also make some interesting connections between ranking performance measures used in cheminformatics and those used in IR studies.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Drug Discovery Acceleration Using Digital Microfluidic Biochip Architecture and Computer-aided-design Flow

A Digital Microfluidic Biochip (DMFB) offers a promising platform for medical diagnostics, DNA sequencing, Polymerase Chain Reaction (PCR), and drug discovery and development. Conventional Drug discovery procedures require timely and costly manned experiments with a high degree of human errors with no guarantee of success. On the other hand, DMFB can be a great solution for miniaturization, int...

متن کامل

Novel machine learning methods for computational chemistry

The experimental assessment of absorption, distribution, metabolism, excretion, toxicity and related physiochemical properties of small molecules is counted among the most timeand cost-intensive tasks in chemical research. Computational approaches, such as machine learning methods, represent an economic alternative to predict these properties, however, the limited accuracy and irregular error r...

متن کامل

Assessment of "drug-likeness" of a small library of natural products using chemoinformatics

Even though natural products has an excellent record as a source for new drugs, the advent of ultrahigh-throughput screening and large-scale combinatorial synthetic methods, has caused a decline in the use of natural products research in the pharmaceutical industry. This is due to the efficiency in generating and screening a high number of synthetic combinatorial compounds; whereas traditional ...

متن کامل

Assessment of "drug-likeness" of a small library of natural products using chemoinformatics

Even though natural products has an excellent record as a source for new drugs, the advent of ultrahigh-throughput screening and large-scale combinatorial synthetic methods, has caused a decline in the use of natural products research in the pharmaceutical industry. This is due to the efficiency in generating and screening a high number of synthetic combinatorial compounds; whereas traditional ...

متن کامل

StructRank: A New Approach for Ligand-Based Virtual Screening

Screening large libraries of chemical compounds against a biological target, typically a receptor or an enzyme, is a crucial step in the process of drug discovery. Virtual screening (VS) can be seen as a ranking problem which prefers as many actives as possible at the top of the ranking. As a standard, current Quantitative Structure-Activity Relationship (QSAR) models apply regression methods t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Journal of chemical information and modeling

دوره 50 5  شماره 

صفحات  -

تاریخ انتشار 2010